Reviews: Can SGD Learn Recurrent Neural Networks with Provable Generalization?
This paper shows that Elman RNNs optimized with vanilla SGD can learn concepts where the target output at each position of the sequence is any function of the previous L inputs that can be encoded by a two-layer smooth neural network. There are multiple assumptions and complications in establishing the main result. The crux of the proof is to show that if the RNN is overparameterized enough, then starting from a randomly initialized recurrent matrix W, there exists a function that is linear in W and whose value at a specific point W* is a good approximation to the target in the concept class. Showing that SGD moves in a direction aligned with such a W* gives the desired result. Another interesting aspect of the main result is that the number of samples SGD needs depends only logarithmically on the number of RNN neurons, making the result applicable to overparameterized settings.
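To make the linearization idea in the proof concrete, below is a minimal numerical sketch, not the paper's actual construction: the dimensions, the ReLU activation, and the finite-difference check are all illustrative assumptions. It demonstrates the property the analysis exploits, namely that near a random initialization W0 the output of a wide Elman RNN changes almost linearly in the recurrent matrix.

# Hedged sketch (illustrative only): near a random init W0, the RNN output
# is well approximated by a function linear in the recurrent matrix W,
# so SGD effectively operates on a nearly linear model.
import numpy as np

rng = np.random.default_rng(0)
m, d, T = 512, 16, 8          # hidden width, input dim, sequence length

A = rng.normal(0, 1 / np.sqrt(m), (m, d))   # input embedding (held fixed)
B = rng.normal(0, 1 / np.sqrt(m), (1, m))   # readout layer (held fixed)
W0 = rng.normal(0, 1 / np.sqrt(m), (m, m))  # random Elman recurrence at init

def rnn_output(W, xs):
    """Last-token scalar output of a ReLU Elman RNN with recurrence W."""
    h = np.zeros(m)
    for x in xs:
        h = np.maximum(W @ h + A @ x, 0.0)
    return (B @ h).item()

xs = [rng.normal(size=d) for _ in range(T)]
Delta = rng.normal(0, 1, (m, m))
Delta /= np.linalg.norm(Delta)              # unit-norm perturbation direction

for tau in [1e-2, 1e-1, 1.0]:
    f0 = rnn_output(W0, xs)
    f1 = rnn_output(W0 + tau * Delta, xs)
    # finite-difference estimate of the first-order (linear-in-W) term
    eps = 1e-5
    lin = (rnn_output(W0 + eps * Delta, xs) - f0) / eps * tau
    print(f"tau={tau:5.2f}  true change={f1 - f0:+.4f}  linear part={lin:+.4f}")

# For small tau the true change matches the linear part closely: this is the
# overparameterized, nearly linear regime in which the SGD analysis operates.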
Can SGD Learn Recurrent Neural Networks with Provable Generalization?
Zeyuan Allen-Zhu, Yuanzhi Li
Recurrent Neural Networks (RNNs) are among the most popular models in sequential data analysis. Yet, in the foundational PAC learning language, what concept classes can they learn? Moreover, how can the same recurrent unit simultaneously learn functions from different input tokens to different output tokens without interfering with each other? In this paper, we show that, using vanilla stochastic gradient descent (SGD), RNNs can actually learn some notable concept classes \emph{efficiently}, meaning that both time and sample complexity scale \emph{polynomially} in the input length (or almost polynomially, depending on the concept). This concept class at least includes functions where each output token is generated from inputs of earlier tokens using a smooth two-layer neural network.
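As a concrete illustration of this concept class (an assumed form for exposition; the notation is not taken verbatim from the paper), the target for the $\ell$-th output token could be a smooth two-layer network applied to a window of the $L$ preceding inputs:

$$F^*_\ell(x_1,\dots,x_{\ell-1}) \;=\; \sum_{r=1}^{p} a_r\,\phi\!\big(\langle w_r,\,(x_{\ell-L},\dots,x_{\ell-1})\rangle\big),$$

where $\phi$ is a smooth activation (e.g., a low-degree polynomial or $\tanh$). Note that the same recurrent unit must realize such an $F^*_\ell$ at every position $\ell$ simultaneously, which is the difficulty raised in the abstract.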